The Theory of Control Applied to the Prague Dependency Treebank (PDT)
Abstract
One of the most difficult issues in corpus annotation at the underlying syntactic level is the restoration of nodes that are omitted in the surface shape of the sentence but present at the underlying, or deep, syntactic level. In the present paper we concentrate on the type of nodes that are omitted due to the phenomenon usually called grammatical control, with regard to their respective anaphoric relations. In particular, we extend the notion of control to nominalization and demonstrate how this relation is captured in the Prague Dependency Treebank. The theory of control is present within Chomsky's framework of Government and Binding (using the terms verb of control, controller and controllee, cf. Chomsky, 1980), but also within many other formal frameworks, e.g. GPSG (Sag and Pollard, 1991) or categorial grammar (Bach, 1979). We analyse this phenomenon within the framework of dependency grammar, theoretically based on the Functional Generative Description (FGD, cf. Sgall, Hajičová and Panevová, 1986). In FGD, on the underlying or tectogrammatical level, control is a relation of obligatory or optional referential dependency between a controller (the antecedent) and a controllee (the empty subject of the nonfinite complement, i.e. the controlled clause). The controller is one of the participants in the valency frame of the governing verb: Actor (ACT), Addressee (ADDR), or Patient (PAT). The controlled clause also fills a dependency slot in the valency frame of the governing verb and is labeled as Patient or Actor. The empty subject of the controlled clause may stand in different dependency relations to its head word (the infinitive): Actor, or, with passivization of the controlled clause, Addressee or Patient (cf. Koktová, 1992).
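The control relation described in the abstract can be pictured as a small data structure. The following Python sketch is only an illustration under stated assumptions: the class and field names are hypothetical and do not reflect the PDT annotation format, the English verbs are illustrative examples not taken from the paper, and the rule that the controller and the controlled clause occupy different frame slots is an added assumption rather than a claim of the abstract.

```python
from dataclasses import dataclass
from enum import Enum


class Functor(Enum):
    """Participant labels named in the abstract."""
    ACT = "Actor"
    ADDR = "Addressee"
    PAT = "Patient"


# Functor sets as listed in the abstract's description of control in FGD.
CONTROLLER_FUNCTORS = {Functor.ACT, Functor.ADDR, Functor.PAT}
CONTROLLED_CLAUSE_FUNCTORS = {Functor.PAT, Functor.ACT}
CONTROLLEE_FUNCTORS = {Functor.ACT, Functor.ADDR, Functor.PAT}


@dataclass(frozen=True)
class ControlRelation:
    """Hypothetical record of one control relation (not the PDT schema)."""
    governing_verb: str
    controller: Functor         # antecedent's slot in the governing verb's frame
    controlled_clause: Functor  # slot filled by the nonfinite complement
    controllee: Functor         # relation of the empty subject to the infinitive
    obligatory: bool = True     # obligatory vs. optional referential dependency

    def is_well_formed(self) -> bool:
        return (
            self.controller in CONTROLLER_FUNCTORS
            and self.controlled_clause in CONTROLLED_CLAUSE_FUNCTORS
            and self.controllee in CONTROLLEE_FUNCTORS
            # Assumption: one frame slot cannot hold both the controller
            # and the controlled clause.
            and self.controller != self.controlled_clause
        )


# Illustrative English examples (not from the paper):
# "Mary promised John to come" -> the Actor of "promise" is the controller.
promise = ControlRelation("promise", Functor.ACT, Functor.PAT, Functor.ACT)
# "Mary ordered John to come" -> the Addressee of "order" is the controller.
order = ControlRelation("order", Functor.ADDR, Functor.PAT, Functor.ACT)
```

Under these assumptions, `promise.is_well_formed()` and `order.is_well_formed()` both hold, while a record that assigns the same functor to the controller and the controlled clause is rejected.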
Related resources
Valency in the Prague Dependency Treebank: Building the Valency Lexicon
In this article we focus on valency, which belongs to the core phenomena being captured in the underlying level of the Prague Dependency Treebank (PDT). We present a summary of the basic principles of the applied theoretical framework including proposals for suitable refinement relevant to NLP. The current status of description of valency behavior of verbs, nouns and adjectives is outlined. We ...
Information Structure with the Prague Arabic Dependency Treebank
The issue of information structure in language has been studied extensively both in the Prague School of Linguistics (Mathesius, 1929) and in the Functional Generative Description (FGD), one of the modern theories of representation of linguistic meaning (Sgall, 1967; Sgall et al., 1986; Hajičová and Sgall, 2003, 2004). In its entirety, FGD constitutes the framework for a family of projects in c...
Nouns as Components of Support Verb Constructions in the Prague Dependency Treebank
Support Verb Constructions (SVCs) are combinations of a noun denoting an event or a state and a lexical verb. From the semantic point of view, the noun seems to be a part of a complex predicate rather than the object (or subject) of the verb, despite what the surface syntax suggests. The meaning is concentrated in the noun component, whereas the semantic content of the verb is reduced or genera...
Annotating Extended Textual Coreference and Bridging Relations in the Prague Dependency Treebank
This technical report describes the project of manual annotation of extended textual coreference and bridging relations, which has been running at the Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University in Prague, since 2009. It contains the typology of coreference and bridging relations, classification of elements that are annotated for coreference and the a...
A Prague Markup Language profile for the SemTi-Kamols grammar model
In this paper we demonstrate a hybrid treebank encoding format, derived from the dependency-based format used in the Prague Dependency Treebank (PDT). We have specified a Prague Markup Language (PML) profile for the SemTi-Kamols hybrid grammar model that has been developed for languages with relatively free word order (e.g. Latvian). This has allowed us to exploit the tree editor TrEd that has been ...